External validation of the hospital frailty risk score among older adults receiving mechanical ventilation

To externally validate the Hospital Frailty Risk Score (HFRS) in critically ill patients. We selected older adult (≥ 75 years old) hospitalizations receiving mechanical ventilation, using the Nationwide Readmissions Database (January 1, 2016-November 30, 2018). Frailty risk was subcategorized into low-risk (HFRS score < 5), intermediate-risk (score 5–15), and high-risk (score > 15). We evaluated the HFRS to predict in-hospital mortality, prolonged hospitalization, and 30-day readmissions, using multivariable logistic regression, adjusting for patient and hospital characteristics. Model performance was assessed using the c-statistic, Brier score, and calibration plots. Among 649,330 weighted hospitalizations, 9.5%, 68.3%, and 22.2% were subcategorized as low-, intermediate-, and high-risk for frailty, respectively. After adjustment, high-risk patient hospitalizations were associated with increased risks of prolonged hospitalization (adjusted odds ratio [aOR] 5.59 [95% confidence interval [CI] 5.24–5.97], c-statistic 0.694, Brier 0.216) and 30-day readmissions (aOR 1.20 [95% CI 1.13–1.27], c-statistic 0.595, Brier 0.162), compared to low-risk hospitalizations. Conversely, high-risk hospitalizations were inversely associated with in-hospital mortality (aOR 0.46 [95% CI 0.45–0.48], c-statistic 0.712, Brier 0.214). The HFRS was not successfully validated to predict in-hospital mortality in critically ill older adults. While it may predict other outcomes, its use should be avoided in the critically ill.
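The two summary metrics reported above can be illustrated with a minimal, unweighted Python sketch. It is not the authors' Stata code, it ignores the NRD survey weights, and the function names and toy data are hypothetical. The c-statistic is the proportion of event/non-event pairs in which the event received the higher predicted risk (0.5 is chance, 1.0 is perfect); the Brier score is the mean squared difference between the predicted probability and the observed 0/1 outcome (lower is better; a constant prediction of 0.5 scores 0.25).

```python
from typing import Sequence


def brier_score(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Mean squared difference between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)


def c_statistic(y_true: Sequence[int], p_pred: Sequence[float]) -> float:
    """Share of event/non-event pairs ranked correctly; ties count as one half."""
    events = [p for y, p in zip(y_true, p_pred) if y == 1]
    non_events = [p for y, p in zip(y_true, p_pred) if y == 0]
    concordant = 0.0
    for p_event in events:
        for p_non_event in non_events:
            if p_event > p_non_event:
                concordant += 1.0
            elif p_event == p_non_event:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


# Toy data (hypothetical): observed outcomes and predicted risks.
outcomes = [1, 0, 1, 0]
predicted = [0.8, 0.3, 0.4, 0.6]
print(brier_score(outcomes, predicted), c_statistic(outcomes, predicted))
```

The pairwise loop is quadratic in the number of observations, so it is suitable only for illustration; a rank-based formula would be needed at the scale of the NRD.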

Measurements. The covariates in the NRD included age, biological sex, hospital characteristics (teaching status, size), income quartile, primary insurance status (Medicare, Medicaid, private insurance, self-pay, or other), and Elixhauser-van Walraven comorbidity index score 25 . ICD-10-CM and ICD-10-PCS codes were used to classify comorbidities (ESM eTable 1). We determined the primary reason for admission for the index hospitalization and for readmission using the first listed diagnosis (DX1) and the aggregate groups of the Clinical Classifications Software Refined (CCSR) developed by HCUP (ESM eTable 2) 26 . Hospital costs were calculated as total hospital charges multiplied by the all-payer cost-to-charge ratio, then inflation-adjusted to 2018 US dollars using the US Bureau of Labor Statistics Consumer Price Index for medical care 27,28 . Linked visits were identified through a linking variable.
Frailty risk. Frailty risk was assessed using the HFRS developed by Gilbert et al. (ESM eTable 3) 16 . We classified patients as low-risk (score < 5), intermediate-risk (score 5–15), or high-risk (score > 15) for frailty, based on the original HFRS study and subsequent validation studies 16-18 .
Outcomes. We evaluated the performance of the HFRS in predicting in-hospital mortality as the primary outcome. The predictive performance of the HFRS for prolonged hospitalization and 30-day emergency hospital readmissions was evaluated as secondary outcomes. We evaluated only in-hospital all-cause mortality, rather than 30-day mortality (inpatient or outpatient), because the NRD records only in-hospital deaths. We defined prolonged hospitalization as a hospital length of stay > 10 days and evaluated only 30-day emergency hospital readmissions, similar to Gilbert et al 16,17 .
Statistical analysis. All statistical analyses were performed using Stata/MP 15.1 (College Station, Texas, US). A two-sided p value < 0.05 was considered statistically significant.
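The frailty categories, the prolonged-hospitalization threshold, and the cost calculation described under Measurements, Frailty risk, and Outcomes can be written as a short Python sketch. The function and parameter names are hypothetical and the consumer price index values in the example are placeholders, not published figures; the thresholds and the charge-to-cost formula are taken from the text above.

```python
def hfrs_category(score: float) -> str:
    """Classify an HFRS score as low (< 5), intermediate (5-15), or high (> 15) risk."""
    if score < 5:
        return "low"
    if score <= 15:
        return "intermediate"
    return "high"


def prolonged_hospitalization(length_of_stay_days: float) -> bool:
    """Prolonged hospitalization is a hospital length of stay of more than 10 days."""
    return length_of_stay_days > 10


def hospital_cost_2018_usd(
    total_charges: float,
    cost_to_charge_ratio: float,
    cpi_discharge_year: float,
    cpi_2018: float,
) -> float:
    """Convert charges to costs, then inflate to 2018 US dollars.

    The CPI arguments are assumed to be medical-care index values for the
    discharge year and for 2018 (hypothetical inputs supplied by the caller).
    """
    cost = total_charges * cost_to_charge_ratio
    return cost * (cpi_2018 / cpi_discharge_year)


# Example with placeholder numbers.
print(hfrs_category(12.3), prolonged_hospitalization(14))
print(hospital_cost_2018_usd(10_000, 0.25, cpi_discharge_year=100.0, cpi_2018=110.0))
```

Note that both boundary scores, 5 and 15, fall in the intermediate-risk group under this definition.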
We accounted for the complex sampling design of the NRD using sampling weights provided by HCUP 23 . Categorical variables were presented as unweighted numbers and weighted percentages. Continuous variables were presented as either means (standard deviation [SD]), or medians (interquartile range [IQR]), following testing for normality. Survey-specific Rao-Scott tests were used to compare nominal data. Survey-specific linear regression was used to compare continuous data, using the geometric means for non-normal data. Missing data were present in < 5% of all patient visits. As a result, a complete case analysis was performed for all analyses given the complex sampling design 29,30 .
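As an illustration of reporting unweighted numbers alongside weighted percentages, the following Python sketch sums a discharge-level sampling weight within each category. The category labels and weights are invented for the example, and the sketch covers point estimates only; it does not reproduce the design-based variance estimation or the Rao-Scott tests.

```python
from collections import Counter, defaultdict
from typing import Dict, Sequence


def unweighted_counts(categories: Sequence[str]) -> Dict[str, int]:
    """Number of sampled records in each category."""
    return dict(Counter(categories))


def weighted_percentages(
    categories: Sequence[str], weights: Sequence[float]
) -> Dict[str, float]:
    """Percentage of the weighted total that falls in each category."""
    totals: Dict[str, float] = defaultdict(float)
    for category, weight in zip(categories, weights):
        totals[category] += weight
    grand_total = sum(totals.values())
    return {category: 100.0 * total / grand_total for category, total in totals.items()}


# Hypothetical records: one low-risk record carrying a larger sampling weight.
groups = ["low", "high", "high"]
sampling_weights = [2.0, 1.0, 1.0]
print(unweighted_counts(groups))
print(weighted_percentages(groups, sampling_weights))
```

In this toy example the high-risk group has two of the three records but only half of the weighted total, which is why the two quantities are reported separately.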
We assessed the validity of the HFRS for predicting in-hospital mortality, prolonged hospitalization, and 30-day emergency hospital readmission, using unadjusted and adjusted logistic regression. For in-hospital mortality and prolonged hospitalization, we performed adjustment for age, biological sex, income quartile, insurance status, do-not-resuscitate status, admission diagnosis, hospital characteristics, and year. For 30-day emergency

Sensitivity analyses.
We performed several sensitivity analyses to assess the robustness of our findings.
First, we re-evaluated our findings using the HFRS as a continuous variable and using restricted cubic splines with five knots 33 . Next, we performed survey-specific Cox proportional hazards regression for in-hospital mortality and 30-day emergency hospital readmissions 34 . Subsequently, we derived 30-day in-hospital mortality using hospitalization data from the NRD and repeated our primary analysis. We performed additional post hoc analyses, restricting the population to those who received mechanical ventilation for more than 24 h and to those who were admitted emergently. Additionally, subgroup analyses were performed for patients who received major operative procedures and those who did not. We also performed a sensitivity analysis adjusting for the timing of receipt of mechanical ventilation. We then performed multiple imputation with chained equations for missing data using 10 imputations, and repeated the primary analysis with the imputed dataset 35 . Finally, as a post hoc analysis, we evaluated the total population of older adults in the NRD, independent of receipt of mechanical ventilation, to determine whether our findings held for the entire older adult population.
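For readers unfamiliar with restricted cubic splines, the sketch below builds the standard truncated-power basis for a continuous score in Python. With five knots it yields one linear term and three nonlinear terms, and the fitted curve is constrained to be linear below the first knot and above the last. The knot locations in the example are hypothetical and are not those used in the study, and statistical packages often rescale the nonlinear terms by a constant, which this sketch omits.

```python
from typing import List, Sequence


def rcs_basis(x: float, knots: Sequence[float]) -> List[float]:
    """Restricted cubic spline basis for x: [x, nonlinear_1, ..., nonlinear_(k-2)]."""
    t = sorted(knots)
    k = len(t)
    if k < 3:
        raise ValueError("a restricted cubic spline needs at least three knots")

    def pos_cube(u: float) -> float:
        # Truncated cubic: u^3 when u > 0, otherwise 0.
        return u ** 3 if u > 0 else 0.0

    last_gap = t[-1] - t[-2]
    basis = [x]
    for j in range(k - 2):
        # The two correction terms cancel the cubic and quadratic parts
        # beyond the last knot, leaving the tail linear.
        term = (
            pos_cube(x - t[j])
            - pos_cube(x - t[-2]) * (t[-1] - t[j]) / last_gap
            + pos_cube(x - t[-1]) * (t[-2] - t[j]) / last_gap
        )
        basis.append(term)
    return basis


# Hypothetical knot locations on the HFRS scale (five knots).
example_knots = [1.0, 5.0, 10.0, 15.0, 25.0]
print(rcs_basis(12.0, example_knots))
```

Each row of basis values would then enter the regression model as covariates in place of the single continuous HFRS term.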
Ethics approval and consent to participate. This
www.nature.com/scientificreports/
of all hospitalizations (Table 1). Of survivors, 20.3% were readmitted to hospital within 30 days. Hospitalizations at high risk for frailty had a higher incidence of prolonged hospitalization and 30-day emergency hospital readmissions (all p < 0.001) than the low-risk group. However, they had a lower incidence of in-hospital mortality than the other frailty groups (p < 0.001).

Assessment of model performance.
Model performance was assessed for in-hospital mortality, prolonged hospitalization, and 30-day emergency hospital readmission (

Sensitivity analyses. Detailed information on the sensitivity analyses is available in the ESM eResults and in ESM eTables 6–16. We performed several analyses to evaluate the robustness of our analysis method, including re-analyzing our data using the HFRS as a continuous variable (ESM eTable 6) or using restricted cubic splines with five knots (ESM eTable 7, Fig. 3), performing Cox proportional hazards regression (ESM eTable 8), evaluating 30-day in-hospital mortality (ESM eTable 9), and performing multiple imputation with chained equations (ESM eTable 15). These additional analyses did not alter our overall findings.

Discussion
In this study, the primary objective was to externally validate the HFRS for predicting in-hospital mortality in a large, nationally representative cohort of older adults receiving mechanical ventilation. In its current form, the HFRS could not be successfully validated for use in this population. As expected, we found that hospitalizations involving mechanical ventilation that were at intermediate or high risk for frailty, as categorized by the HFRS, were associated with increased risks of prolonged hospitalization and 30-day emergency hospital readmissions, compared to low-risk hospitalizations. Counterintuitively, they were inversely associated with in-hospital mortality when compared to low-risk hospitalizations, suggesting a potentially spurious relationship. Regardless, the HFRS had only moderate discrimination and accuracy in predicting any of these outcomes. Using the HFRS as a continuous variable or with splines did not provide additional value over using the HFRS subcategories of low-, intermediate-, and high-risk. Our findings suggest that clinicians and researchers should avoid using the HFRS when conducting big data research with administrative datasets of critically ill patients.
study of 1,498 patients evaluated the HFRS to predict a combined endpoint of mortality and risk of readmission and found no association after adjustment for severity of illness 21 . In a large population study in Wales, the HFRS had only moderate ability to predict inpatient, 6-month, and 1-year mortality in hospital and ICU patients 41 .
Conversely, a US study of 12,854 patients, using the single-center Medical Information Mart for Intensive Care (MIMIC-III) database, found that higher HFRS was associated with an increased risk of 28-day mortality 40,42 . In our study, we found that critically ill older adult hospitalizations receiving mechanical ventilation were at high risk of poor outcomes, including prolonged hospitalization (41%), 30-day in-hospital mortality (44%), in-hospital mortality (45%), and 30-day emergency hospital readmission (20%). Unsurprisingly, palliative care utilization was very high at 26.8%, with higher use in the high-risk frailty groups. The overall readmission rate was high in this study, suggesting current difficulties in transitions of care for these patients and potential room for quality improvement.
Prior studies of critically ill patients have established that frailty is associated with increased risks of mortality 3,4 . Counterintuitively, we found that the HFRS was inversely associated with mortality in the NRD (i.e., lower HFRS was associated with the highest risks of in-hospital mortality). To investigate this surprising finding, we performed a post hoc analysis on the entire NRD population of older adults, independent of the receipt of mechanical ventilation, and found that the HFRS performed well on the whole population (i.e., higher HFRS was associated with the highest risks of in-hospital mortality in all older adults) (ESM eTable 13).
There are several possible explanations for this unusual phenomenon, including selection biases and coding biases. In their original study, Gilbert et al. validated the HFRS in a general hospitalized population to predict in-hospital mortality 16 . In general, critically ill patients are at higher risk of death compared to a general hospitalized population, representing a surrogate endpoint. Therefore, by limiting our population to mechanically ventilated patients, selection bias may have been introduced, potentially altering the true association of the HFRS and mortality. Coding biases may also occur, as critically ill patients who had prolonged hospitalizations and/or survived their hospitalization may appear more "frail," because they accrue more ICD-10-CM secondary diagnoses in their medical records. In the NRD, most of the hospitalizations of older adults receiving mechanical ventilation were in the intermediate-risk frailty group, and most hospitalizations in the high-risk group had significantly more ICD-10-CM codes captured compared to hospitalizations in the low-risk group. Finally, frail patients with higher severity of illness or those with treatment limitations may choose less invasive treatments, introducing further selection bias. We did adjust for do-not-resuscitate status; however, this may not fully capture all treatment limitations. These biases and differences in the ICU patient population from the original development cohort could potentially explain why the HFRS had mixed performance for predicting in-hospital mortality in an ICU patient population, as seen in this study and others.

Strengths and limitations.
Our study had several strengths, including the use of a large multicentre dataset comprising close to 650,000 weighted hospitalizations. To our knowledge, our study represents one of the largest studies of critically ill patients examining the use of the HFRS, allowing for generalizability of our findings to critically ill older adults receiving mechanical ventilation. Unlike prior external validation studies in administrative databases of critically ill patients, we evaluated the HFRS to predict prolonged hospitalization and 30-day emergency hospital readmissions. Additionally, we assessed both model discrimination and calibration, allowing for confidence in the results presented. Finally, our study performed several sensitivity analyses to verify our findings. However, our study has limitations. As discussed previously, selection bias may have occurred in our selection of a mechanically ventilated population. The NRD was not designed specifically to flag admissions for critical care. Hence, critically ill patients were identified through ICD-10 codes specific to mechanical ventilation. Other codes, such as vasopressor use, are known to be significantly undercoded in administrative databases 43 . As the HFRS is derived from a composite of ICD-10 codes, coding practices and biases may affect the relative prevalence of admission comorbidities, diagnoses, and treatments. Some codes important to the determination of the HFRS, such as dementia in Alzheimer's disease (F00) or care involving the use of rehabilitation procedures (Z50), were undercoded (ESM eTable 3). This is similarly seen in other databases, including the Centers for Medicare & Medicaid Services and National Inpatient Sample databases 36,37 . Other databases of critically ill patients may perform differently, depending on their coding practices.
Additionally, the NRD does not have sufficient information to determine ICU severity of illness, such as the Sequential Organ Failure Assessment (SOFA) or Acute Physiology and Chronic Health Evaluation II (APACHE II) scores. We are therefore unable to verify whether the HFRS would perform better after adequate adjustment for severity of illness; however, other studies suggest that the HFRS does not perform well even after adjustment for severity of illness 21 . Likewise, the NRD does not capture detailed clinical information (e.g., patient weight, vasopressor dosing), and while it collects information on the duration of mechanical ventilation, this information is often incomplete. Furthermore, it does not record out-of-hospital deaths, limiting us to evaluating only in-hospital mortality. Finally, we did not evaluate other scores, as this was beyond the scope of our study. These limitations highlight the difficulty of applying the HFRS to datasets of critically ill patients and further support our recommendation to avoid using the HFRS to predict these outcomes.
Clinical implications, research implications, and future directions. Clinicians need accurate predictions of frailty and outcomes to identify patients who would benefit from early geriatric medicine referral, as well as to engage patients and their families in shared decision-making, goals-of-care discussions, end-of-life planning, and/or palliative care referral. Likewise, healthcare administrators need accurate estimates of the number of frail patients to plan and allocate healthcare services. Big data researchers need accurate scores to classify patients correctly.
While the HFRS may have utility in non-ICU databases, our study demonstrates its limitations in critically ill patients. The modified Frailty Index (mFI) is a promising alternative; however, it needs further development and validation for use with ICD-10-CM codes 15,44 . Perhaps the better solution for clinicians, researchers, and administrators would be to adapt and transform existing databases for frailty research. For other well-validated frailty scores such as the Clinical Frailty Scale (CFS), there is a compelling argument for integration into routine clinical practice and inclusion in data capture. Future research should be performed to re-develop the HFRS or other scores with different weighting specifically for critically ill patients.

Conclusion
In this large, nationally representative external validation study of older adults receiving mechanical ventilation, the HFRS could not be validated to predict in-hospital mortality. While the HFRS may predict prolonged hospitalization and 30-day emergency hospital readmissions, its use should be avoided in the critically ill. Further research with administrative databases is necessary to develop accurate, intuitive frailty scores in critically ill patients.

Data availability
The Nationwide Readmissions Database is available through the Healthcare Cost and Utilization Project (https://www.hcup-us.ahrq.gov/nrdoverview.jsp).